This work surveys prior studies of stock market analysis with machine learning (ML) and deep learning (DL) models, covering the parameters and context those studies consider. It reviews the methodologies and algorithms used to understand market trends and the results the studies report. Its purpose is to summarize and analyse the parameters that most strongly influence the understanding of stock market trends. The outcome is an understanding of the factors that directly and indirectly influence the rise and fall of stock values. The work highlights the methodologies and algorithms used for stock market data analysis and for the efficient and effective recommendation of stable stocks to customers. We also list the research gaps and future enhancements left open by earlier studies. Finally, the work points out the limitations of some existing studies, along with the significance of hyperparameter techniques for clearly identifying the features that make better analysis of the data possible.


1. INTRODUCTION

Stock data analysis is difficult because the market is highly volatile and because the analysis involves numerous parameters, such as market conditions, company background, customer retention rate, political conditions, and price hikes in the country. Many other aspects also influence stock market trading: foreign exchange, ownership changes in multinational corporations (MNCs), government policies, regulatory decisions of the Securities and Exchange Board of India (SEBI) and the Reserve Bank of India (RBI), changes in interest rates, the decisions of foreign and domestic investors, pandemics, natural disasters, and the prices of gold and bonds; the list goes on. Even stock market analysts with many years of trading experience may be unable to point investors to the shares that will benefit them. To study and analyse stock market data, the method followed should be of sufficient quality to identify the internal patterns of the data and to surface the highly influential parameters behind the highly volatile prices.

Most customers, and the market itself, now revolve around online trading, which is accessible to everyone. Members can easily invest from anywhere, and the expectation is that even those with little knowledge of the stock market can benefit, but the reality is very different. The opportunity, then, is that better predictions would let customers benefit more from such observations and recommendations. However, as mentioned earlier, stock market movement is non-sequential, complicated, and very volatile. Strong methods and algorithms for forecasting the market are therefore much needed, to guide investors on when to sell their stocks at the highest prices and when to buy at the lowest. In general, professional traders base their decisions on traditional methods such as reviews of the company's revenue, market position, and growth rate. Technical analysis of stock market data, in contrast, examines the relation between stocks and their corresponding prices to suggest when a customer should enter and leave a stock. The supervised classification methods most used in stock market analysis are k-nearest neighbour (KNN), Naïve Bayes, support vector machine (SVM), decision trees, and random forest. Among regression models, we have observed the use of linear and multiple regression. Among deep learning (DL) techniques, recurrent neural networks (RNN) and long short-term memory (LSTM) are the models most often chosen in the literature. The majority of researchers used Naïve Bayes and SVM as supervised learning mechanisms, recorded some results in estimating stock market predictions, and suggested some pointers to investors.

While performing the literature review, a few points caught our attention. Selecting a model and predicting stock market prices as well as possible is a fundamental requirement, but it is not the end: it also matters how we identify the highly influential parameters in the data and how we reduce unused dimensions and model complexity while processing the data with the above-mentioned algorithms. To add value to our research, we focus on the optimization techniques that can be employed while creating the models and suggesting the most reasonable stocks to investors. To implement the research work we have chosen PySpark on the Databricks community platform, which provides cluster support with a configuration of 15.3 GB memory, 2 cores, and 1 DBU. The runtime version is 10.4 LTS, which includes Apache Spark 3.2.1 and Scala 2.12.


2. LITERATURE REVIEW 

Nabipour et al. [1] first used continuous data for the features, with the F1-score as the metric for quantifying results. Among the machine learning (ML) algorithms, Naive Bayes and decision tree gave the lowest performance, at about 68%, whereas DL models such as RNN and LSTM gave encouraging results, at 86%. Our analysis of this work is as follows. Naive Bayes treats all features as equally important, so all the weak learners take part in the prediction; the stock market does not behave this way, which explains the weaker predictions. Naive Bayes can therefore be used under normal market conditions but is not recommended under abnormal ones. The decision tree model also ends up with a 68% F1-score. Since the F1-score combines precision and recall, there is scope to also report accuracy and sensitivity and to use those measures to better understand the source data. The DL models do better, at 86%, which shows that RNN or LSTM can benefit from the time-series nature of the source data and from their ability to retain information. One issue with this approach is the vanishing gradient problem, which we are going to discuss extensively in this research.
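
Because the F1-score is described above as a combination of precision and recall, a small sketch may help make the relationship concrete. The labels below are hypothetical (1 for a price rise, 0 for a fall) and are not taken from [1]; the function name is our own.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = price went up, 0 = price went down.
actual = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 1, 0, 0]
p, r, f1 = precision_recall_f1(actual, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Recall is the same quantity as sensitivity, so reporting it separately from the F1-score already covers one of the additional measures suggested above.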

The second approach in [1] used binary data, which gave better predictions than the first approach: about 85% for Naive Bayes and decision trees, and around 90% for RNN and LSTM. The second approach also completed model runs faster. It adds an additional layer that identifies the trend by comparing the previous and current continuous values, with the aim of predicting trends in newly recorded data. Shah et al. [2] proposed a framework based on DL architectures. The methodology applies a convolutional neural network (CNN) and LSTM; the novelty is a hybrid of CNN on top of LSTM and dense layers to predict the prices of a stock market index on the NSE. In training, the model recorded an R-square value of 0.989, a mean absolute error (MAE) of 168.558, a mean absolute percentage error (MAPE) of 0.0234, and a root mean squared error (RMSE) of 199.076. In testing, R-square was 0.943, MAE 242.418, MAPE 0.0310, and RMSE 413.902.
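
The four measures reported for [2] can be computed as in the following minimal sketch. The price values are hypothetical and are not related to the NSE index data; MAPE is returned as a fraction, which matches the scale of the values quoted above.

```python
import math


def regression_metrics(actual, predicted):
    """Return R-square, MAE, MAPE (as a fraction) and RMSE."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE divides by the actual value, so actual values must be non-zero.
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mape": mape, "rmse": rmse}


# Hypothetical closing prices and model predictions.
actual_prices = [100.0, 110.0, 120.0, 130.0]
predicted_prices = [102.0, 108.0, 123.0, 129.0]
metrics = regression_metrics(actual_prices, predicted_prices)
print(metrics)
```

MAE and RMSE are expressed in the units of the price series, whereas R-square and MAPE are scale-free, which is why the former are so much larger in the figures reported for [2].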

Our observation from this work is that RNN and LSTM are the two commonly used approaches, whereas this work takes a different strategy by combining CNN and LSTM. The basic idea behind using LSTM is to process and make predictions based on sequences of data. CNN helps to exploit the spatial correlation [3] in data and works well with images and speech. The notable property here is that CNN can remember much longer sequences and is competitive with LSTM, or even better. The work in [4] employed a prediction model using DL models. The results were obtained with multilayer perceptron (MLP), RNN, LSTM, and CNN on NSE stock market data for Tata Motors, which belongs to the automobile sector. The MAPE values using auto regressive integrated moving average (ARIMA) were 20.66 for Maruti, 24.69 for HCL, and 19.64 for Axis Bank. The MAPE values observed with the DL models were as follows: with RNN, Maruti 5.82, HCL 5.40, and Axis Bank 11.64; with LSTM, Maruti 6.37, HCL 6.97, and Axis Bank 8.13; with CNN, Maruti 5.36, HCL 6.42, and Axis Bank 7.94.


Stock market analysis with the usage of machine learning and deep ... (Seethiraju L. V. V. D. Sarma)

554 ISSN: 2302-9285


The observation here is that ARIMA fails to identify the underlying dynamics, whereas CNN did well compared with the other three models (RNN, LSTM, and ARIMA) in capturing them. MAPE, also referred to as the absolute percentage deviation, measures the accuracy of a forecasting system; lower values indicate better results, but it is a poor accuracy indicator on its own. The work could have been stronger had it also used weighted MAPE and symmetric MAPE. Prerana et al. [5] suggested various aspects to consider for improving stock market predictions with DL algorithms. We have observed that, when applying the random forest (RF) algorithm to stock market data, more parameters can be selected to better understand the underlying relations, since RF adds randomness to the model and searches for the best feature among the data. The other research gap observed in this work is that it did not cover social media data and news articles, which would help to understand the markets as closely as possible.
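
The weighted and symmetric variants of MAPE suggested above can be sketched as follows. Several definitions of symmetric MAPE are in use; this sketch assumes the form whose denominator is the mean of the absolute actual and predicted values. The input values are hypothetical.

```python
def wmape(actual, predicted):
    """Weighted MAPE: total absolute error divided by total absolute actual."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


def smape(actual, predicted):
    """Symmetric MAPE, using (|actual| + |predicted|) / 2 as the denominator."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        # A zero denominator means both values are zero, so the error is zero.
        terms.append(abs(a - p) / denom if denom else 0.0)
    return sum(terms) / len(terms)


# Hypothetical prices: one low-priced and one high-priced observation.
actual_prices = [100.0, 200.0]
predicted_prices = [110.0, 180.0]
print(f"WMAPE={wmape(actual_prices, predicted_prices):.4f}")
print(f"sMAPE={smape(actual_prices, predicted_prices):.4f}")
```

Weighted MAPE gives high-priced observations proportionally more influence, while symmetric MAPE treats over-prediction and under-prediction more evenly than plain MAPE does.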

The generic approach in stock market trading is to rely on existing price data to predict future price changes. Financial forecasting resembles a signal processing problem, and it is not simple because of the small number of samples, high noise, lack of stationarity, and non-linearity of the data [6]-[8]. Analysing long periods of stock market data for real-time transactions may not always give the right results, because conditions can change abnormally over the period. The helpful observation is that if stock market analysts can observe the features over time, they can obtain valuable information for suggesting the rates at which stocks rise and fall. To understand the behaviour of stock market data, various researchers have proposed feed forward neural networks [9], SVM [10], and RNN [11] for stock market predictions.

SVM is suitable for multi-dimensional data processing through the kernel trick; it separates the data well, even when we have no prior idea about the source data. The benefit of choosing SVM is the good accuracy of its results. At the same time, the main problem is that SVM is not suitable for large, noisy data sets, so much time is required to pre-process the data. The research works [12]-[14] showed the applicability of neural networks to forecasting the next day's closing prices; observations over such a short span are sometimes required for instant decisions. Our observation on this line of work is that, with a DL model, almost any kind of functionality is easily achievable, and the visual presentation can be explained to functional users. The activation functions and cost functions are simple yet powerful for obtaining results in a structured way. The methodology in that work focuses on comparing the prediction efficiency of a simple artificial neural network (ANN) model and RF, and shows that the ANN outperforms RF. The implementation covered RNN, LSTM, CNN, and multilayer perceptron. In some cases LSTM outperformed the other models; in other specific cases CNN did.

Choosing algorithms such as RNN and LSTM with fine-tuned parameters leads to a better understanding of stock market data. A text-based approach is also required, because the originating companies have a set of customers who continuously follow and read the news published in those papers. In a study on AI-based stock market prediction using an ML algorithm, the authors proposed a method of recommending certain stocks based on their closing prices [15]. The algorithms used in this work are the Holt-Winters algorithm (HWA), RNN, and a recommendation system that helps investors with the best stocks to consider. HWA, or triple exponential smoothing, takes three components into consideration: the base level, the trend, and the seasonal factor. The RNN part specifically used the LSTM approach. The recommendation system is a subclass of information filtering system that seeks to predict the rating of stocks based on certain parameters. The RMSE of the RNN implementation is calculated and used as the accuracy measure for the predictions; along with this, the approach uses weighted recommendations, which consider various factors to compare stock prices.
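
To illustrate the three components of Holt-Winters smoothing named above (level, trend, and seasonal factor), the following is a minimal sketch of the additive form. It is not the implementation used in [15]; the initialization scheme, the smoothing parameters, and the toy series are assumptions made for illustration.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` forecasts."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    # Simple initialization (an assumption): level is the first-season mean,
    # trend is the average per-step change between the first two seasons.
    level = sum(series[:season_len]) / season_len
    second_mean = sum(series[season_len:2 * season_len]) / season_len
    trend = (second_mean - level) / season_len
    seasonal = [series[i] - level for i in range(season_len)]

    for t in range(season_len, len(series)):
        y = series[t]
        s = seasonal[t % season_len]
        prev_level = level
        # Level: deseasonalized observation blended with the previous projection.
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal factor: deviation from the level blended with the old factor.
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + (h + 1) * trend + seasonal[(n + h) % season_len]
        for h in range(horizon)
    ]


# Toy series with a repeating four-step pattern and no trend.
toy_series = [1.0, 2.0, 3.0, 4.0] * 3
forecast = holt_winters_additive(toy_series, 4, alpha=0.5, beta=0.3,
                                 gamma=0.2, horizon=4)
print(forecast)
```

On a purely seasonal series such as this one, the forecast simply repeats the seasonal pattern; on real closing prices the three smoothing parameters would need to be tuned.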

Our observations on this approach are that using RNN with LSTM is a common idea and that the research does not draw any significant results from it. Using RMSE to suggest stocks to investors is a good measure, but we propose also using R² and adjusted R² to penalize features that do not really contribute to the analysis. The work by Anani and Samarabandu [16] used MLP in stock market analysis and reported a highest average directional accuracy of 65.87%; it predicts future stock movements with a hybrid model that integrates fundamental and technical analysis of the stock market. Our observation is that the work focuses only on the downtrend of stocks; the approach could be revised to capture rises in stock prices as well. The research works [17]-[21] highlighted the use of news articles for forecasting stock values while keeping customer sentiment in mind. Table 1 shows the comparative analysis of various research works in stock market analysis.
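
The penalty that adjusted R² applies for additional features, as proposed above, can be seen in a short sketch. The R² value, sample count, and feature counts below are hypothetical.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-square: penalizes R-square for the number of features used."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than features plus one")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Same hypothetical R-square of 0.90 on 50 samples, with 5 and then 20 features.
few_features = adjusted_r2(0.90, n_samples=50, n_features=5)
many_features = adjusted_r2(0.90, n_samples=50, n_features=20)
print(f"5 features: {few_features:.4f}, 20 features: {many_features:.4f}")
```

For the same R², the model that uses more features receives the lower adjusted value, so features that do not improve the fit reduce the score.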

Table 1. Comparative analysis of various research works in stock market analysis

| Work | Theme | Methodology | Results | Comments |
|---|---|---|---|---|
| Stock market trends using ML and DL algorithms via continuous and binary data, a comparative analysis | Usage of ML and DL algorithms | Applicability of continuous data and binary data to observe the model behaviour | The outcome is binary data usage on the algorithms has given better predictions. | Complexity is more and usage of other data pre-processing techniques is one research gap. |
| A stock market trading framework based on deep learning architectures | Implementation of various DL architectures | Applicability of CNN and LSTM in the context of time series models. | Used MAPE so as to estimate the accuracy of the predictions. | Only used MAPE, the possibility of Adj-R² applicability gives the understanding of strong learners. |
| A novel AI-based stock market prediction using ML algorithm | Implementation of time-series models along with neural networks. | Applicability of Holt-Winters algorithm | Weighted recommender system with RMSE measure. | LSTM can be used with more historical and current data; other measures can be applied. |
| DL networks for stock market analysis and prediction: methodology, data representations, and case studies | Implementation of DL algorithms for stock market analysis | Applicability of unsupervised feature extraction methods | NMSE, RMSE, MAE and MI measures were used. | Can be experimented with learning rate, usage of regularization |
| Harvesting social media sentiment analysis to enhance stock market prediction using DL | Identification of stock prices correlate with the expressed opinions in famous social media. | Applicability of ML and DL algos with SVM, MNB classifier, LSTM | Usage of sentiment polarity to find out the sentiments impact on stock market | Sentiment polarity may fail in few cases, such cases were not referred. |
| Stock market analysis using LSTM in DL | Usage of RNN with LSTM to track the stored stock prices | Looping of the data for better understanding of the patterns of the data. | Usage of optimizer Adam and MSE | 90 days data can be taken, global news can be integrated for better understanding of the data. |

Not all articles refer to sentiments directly [22]-[30]. Articles that express no positive or negative sentiment will be ignored by the model, but that does not imply that such articles have no influence on stock market conditions. In some cases positive articles may even correlate negatively with stock prices, and negative news may not affect stock prices proportionally. A stock market analysis model should therefore rely on suitable algorithms that are less sensitive to such general conditions and still reach roughly 60% to 75% prediction accuracy. Some researchers [31]-[39] employed SVM and KNN and reported accuracy ranging from 65% to 81%; see also the work [40]. These works provide some direct results and useful insights through various ML and DL approaches, with the observed predictions and accuracy levels, but the models can still be revised with additional parameter tuning of the features for specific recommendations and predictions on stock market data.
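
The limitation described above, that articles without explicit sentiment are ignored even though they may matter, can be illustrated with a minimal lexicon-based polarity sketch. The word lists and headlines are hypothetical and far smaller than any lexicon used in the cited works.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"gain", "profit", "growth", "surge", "upgrade"}
NEGATIVE = {"loss", "fraud", "decline", "lawsuit", "downgrade"}


def polarity(text):
    """Score in [-1, 1]; 0.0 when no sentiment words are found or they cancel."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


headlines = [
    "Company reports record profit and growth",
    "Regulator changes interest rates",
    "Profit up despite lawsuit",
]
for headline in headlines:
    print(f"{polarity(headline):+.1f}  {headline}")
```

The second headline scores zero and would be dropped by a filter that keeps only polarized articles, although a change in interest rates is one of the factors listed in the introduction as influencing stock prices.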

While going through these research works and their results, we asked whether, in current stock market scenarios, there is any correlation among stock prices, for example among stocks of the same company, stocks that started trading in the same year, stocks from the same domain, or stocks of companies with similar turnover. We also considered the choice of algorithms such as Naive Bayes in specific cases, where all features (weak learners) are given equal priority, so that the predictions are reasonable and valid without many assumptions. A suitable measure for understanding the correlation among stock prices is Pearson's correlation: if the value is high, a rise in one stock price can be expected to accompany a rise in the other. Above all, although stock market prices may move down or up, the stock values of a few companies remain unchanged and are not affected by market conditions. Researchers can observe how market ups and downs affect the stock prices of various companies with the help of ML or DL techniques, and can extract the essence of the stock market data, such as predictions, accuracy, variable importance plots, measures, and metrics, so that the data can be understood [41]-[45]. To consolidate, the focus of the proposed work is on the stocks that do not fall because of market conditions and other parameters such as price hikes and political conditions. Related work in other domains includes a new model predictive controller for a wind energy conversion system [46] and a multipath delay commutator introduced to enhance throughput and speed [47]. Data mining is the concept of gathering new information from huge sets of data. In the past few years, business development in knowledge discovery in databases has grown rapidly in the market, because its processing is useful in all kinds of business marketing. Before the arrival of data mining, business marketing was a slower process and depended more on television advertisements, sponsors, and marketing executives [48].
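
As a small illustration of the Pearson correlation discussed in this section, the following sketch computes the coefficient for pairs of hypothetical closing-price series. Returning None for a constant series is a choice made here, because the coefficient is undefined when one series has zero variance.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient, or None if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return None  # zero variance: correlation is undefined
    return cov / (dev_x * dev_y)


# Hypothetical closing prices for three stocks over four days.
stock_a = [1.0, 2.0, 3.0, 4.0]
stock_b = [2.0, 4.0, 6.0, 8.0]   # moves with stock_a
stock_c = [8.0, 6.0, 4.0, 2.0]   # moves against stock_a
print(pearson(stock_a, stock_b), pearson(stock_a, stock_c))
```

A stock whose price never changes, like the unaffected stocks mentioned above, has zero variance, so this measure cannot be applied to it directly and such stocks need to be identified by other means.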



3. OBSERVATIONS 

The above works can be summarized in a few points, together with some practical aspects we observed. Characteristics of stock market data such as noise, non-stationarity, and non-linearity over long time spans make it hard for researchers to understand the patterns of the data and to obtain good predictions and accurate results, so this aspect needs attention for better predictions. Feed forward networks, SVM, and RNNs were the methods mainly used to analyse and estimate stock market data in terms of predictions and accuracy, as they are a good fit for multi-dimensional data and for recursive processing of the data.

A common point in many of the works is that the models use the closing prices of stock market data as the main feature to forecast the next day's prices. This is not suitable for every company's stock, so integrating other factors into the prediction is essential. The comparative analysis of ANN models with RF in the study of stock market data shows slightly better performance for ANN than for RF. Other work focused on RNN, LSTM, CNN, and multilayer perceptron: in some specific cases LSTM outperformed the other models, and in others CNN did. Our proposal is that there should be some mechanism for assessing the suitability of CNN, RNN, and LSTM, so that researchers can adopt these algorithms with some automation or with solid, proven assumptions for an acceptable range of predictions. This would save a lot of time and technical resources in the process flow.

The other important aspect we observed in the works of various researchers is that the correlation among multiple stock prices may be positive or negative. We emphasize this point as well, because it helps the stock market analyst to identify hidden factors when investing in some stocks (at least with prior correlations). We use the following observations in our proposed stock market data analysis to obtain better predictions through correlation analysis, ML algorithms, DL algorithms, and sentiment analysis based on current market conditions and news feeds.

a. With the provision of large and varied sources of stock market data, we can expect better predictions.

b. Integrating statistical measures such as correlation and covariance matrix analysis may give a better understanding of the stock market data.

c. Using k-fold cross-validation in model implementation, along with grid search CV, provides an extensive understanding of feature importance and parameter validation in stock market data analysis and so supports better predictions.

d. The first context is that the predictions can be refined by integrating other sources of data, such as news articles and feeds, to identify the sentiments of people that influence the stock market data.

e. The second context is the study of the correlation among stocks, which can really affect a particular stock; this helps to reach a conclusion on positive or negative correlations among the stocks.

f. The third aspect is whether stock exchange announcements really affect all stock values or whether a few companies are not affected at all by these conditions; there should be an evidence-based approach to reach the conclusion.

g. The fourth context is whether investor age, employment level, and income really have an impact on the buying and selling of stocks over a particular period.
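
Item c can be illustrated with a minimal sketch of the idea behind a grid search with k-fold cross-validation. This is plain Python rather than scikit-learn's GridSearchCV, and the one-parameter ridge model, the candidate penalties, and the synthetic data are assumptions chosen to keep the example short.

```python
import math


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; yield (train_idx, val_idx)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_ridge_slope(x, y, lam):
    """Closed-form slope for y ~ w * x with L2 penalty lam (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)


def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def grid_search_cv(x, y, lambdas, k=5):
    """Return the penalty with the lowest mean validation RMSE, plus all scores."""
    scores = {}
    for lam in lambdas:
        fold_scores = []
        for train, val in k_fold_indices(len(x), k):
            w = fit_ridge_slope([x[i] for i in train], [y[i] for i in train], lam)
            fold_scores.append(rmse([y[i] for i in val], [w * x[i] for i in val]))
        scores[lam] = sum(fold_scores) / len(fold_scores)
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data with an exact linear relation, y = 2x.
xs = [float(v) for v in range(1, 21)]
ys = [2.0 * v for v in xs]
best_lambda, cv_scores = grid_search_cv(xs, ys, [0.0, 1.0, 10.0, 100.0], k=5)
print(best_lambda, cv_scores)
```

For time-ordered stock data, a forward-chaining split that always validates on later observations is often preferred to the plain contiguous folds used here.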

By covering the four aspects in items d to g above, we believe that we can arrive at better predictions and accuracy values, along with an understanding of the parameters and conditions that act as the driving forces of stock market analysis. Figure 1 summarizes the performance estimates for various approaches: continuous data, binary data, RNN and LSTM on binary data, and CNN with LSTM.


[Figure: chart of performance for four approaches: continuous data, binary data, RNN and LSTM on binary data, and CNN with LSTM]

Figure 1. Various methods and performance


4. EXPERIMENTAL SETUP AND RESULTS

To quantify our proposed methodology, we use Spark with a Python implementation on a single-node Databricks cluster running runtime 10.4 LTS (which includes Apache Spark 3.2.1 and Scala 2.12). The driver type is 15.3 GB memory with 2 cores and 1 DBU. The dataset consists of stock data with date, open, high, low, close, adjusted close, volume, dividend, and split coefficient; it contains 5273 records from the period 1998 to 2018. We obtained the results on this data with the MLlib library of PySpark. We chose decision tree regression and the RMSE measure, with a result of 0.499624.
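
The experiment above relies on the decision tree regressor in Spark MLlib. The following plain-Python sketch is not that implementation; it only illustrates, under simplifying assumptions, what a regression tree does (split on the threshold that minimizes squared error, predict the mean at each leaf) and how RMSE is then computed. The toy data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, depth=0, max_depth=3, min_size=2):
    """rows: list of feature lists; targets: list of floats."""
    if depth >= max_depth or len(rows) < 2 * min_size:
        return mean(targets)  # leaf: predict the mean target
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            cost = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    if best is None:
        return mean(targets)
    _, f, threshold, left, right = best
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           depth + 1, max_depth, min_size),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            depth + 1, max_depth, min_size),
    }


def predict(tree, row):
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Toy data: one feature, target jumps from 10 to 20 when the feature exceeds 4.
rows = [[float(v)] for v in range(1, 9)]
targets = [10.0] * 4 + [20.0] * 4
tree = build_tree(rows, targets, max_depth=2)
print(rmse(targets, [predict(tree, r) for r in rows]))
```

In this sketch the RMSE is measured on the training rows; a held-out test split would be needed for a figure that reflects predictive performance.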

Figure 2 shows the Spark job scheduling and other parameters, such as the stage information, the tasks run for each job, and the time taken to complete each task and job. From this, the developer can estimate performance parameters such as time requirements and the number of tasks Spark created to complete each job. Each Spark job is logically represented as a directed acyclic graph (DAG), as shown in Figure 3, which conveys the number of transformations and actions the Spark engine chose in order to complete the submitted job.


[Figure: Spark UI job list for jobs 18 to 24, submitted on 2022/06/18. Jobs 18 to 22 correspond to model = dt.fit(trainingData) (collectAsMap at RandomForest.scala:678); jobs 23 and 24 correspond to rmse = evaluator.evaluate(predictions). Each job completed within a few seconds.]

Figure 2. Spark job execution

[Figure: Spark UI DAG visualization with a summary of task metrics (duration, shuffle write size and records); the shuffle write covers 5273 records.]

Figure 3. DAG visualizations

Using PySpark on Databricks lets us perform various pre-processing techniques, such as missing value imputation, feature scaling, and encoding; apply machine learning algorithms for classification and regression; and apply unsupervised algorithms to reduce dimensions and cluster similar data, along with rule generation for estimating correlations among stock features. We are working on all these properties and algorithms available in the PySpark libraries, with the support of the Spark architecture, using both RDDs and data frames in the implementation. A further advantage is that the implementation can be done with Spark SQL, which provides a simple yet powerful way of obtaining the outcome in an optimized manner.
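
Two of the pre-processing steps named above, missing value imputation and feature scaling, can be sketched in plain Python as follows. This does not use the PySpark API; mean imputation and min-max scaling are assumed as the specific techniques, and the volume column is hypothetical.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical trading volumes with one missing entry.
volume = [100.0, None, 300.0, 200.0]
imputed = impute_mean(volume)
scaled = min_max_scale(imputed)
print(imputed, scaled)
```

Scaling matters most for distance-based methods such as KNN and SVM, where a feature with a large numeric range like volume would otherwise dominate the price features.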


5. CONCLUSION 

In this work we have listed various works done so far in stock market analysis. The main pointers are the study of stock market data and the applicability of ML algorithms such as SVM, KNN, RF, and decision trees. We also observed the applicability of DL techniques such as CNN, RNN, and ANNs such as feed forward networks, as well as typical models in the reviewed articles such as LSTM and multilayer perceptron-based analysis. Our further work focuses on four pillars. The first is the integration of news article data into the study of stock market data, which helps us to capture the sentiments of people. The second is whether any correlation-type measure can be found in the data, whether there are share values that are not affected at all by market conditions, and which parameters influence these aspects. The third is the features of the customer community, such as age, level of employment, and income range, in the buying and selling of stocks. The fourth is whether common parameters or a strategic recommendation can be suggested to investors; it is too early and too complex to achieve this yet, but we are researching it as well. An additional dimension of our focus is the optimization of the lines of code and the estimation of time and space utilization while running the jobs on the Databricks cluster, which are among the factors to report when quantifying stock market analysis results.